Add Section 11: Path to Main Track Publication (Parallel Track) #975

abrichr · 2026-01-17T05:42:39Z

Summary

Adds Section 11 to the publication roadmap document: a rigorous, honest assessment of what it would take to publish at a main track venue (NeurIPS, ICML, ICLR) rather than at a workshop
Explains why the current work is workshop-level (prompt engineering, not ML research)
Provides four technical contribution options, with effort estimates, for elevating the work
Details the additional experiments, timeline, and resources required, and gives honest recommendations

Changes

11.1 Honest Assessment: Why Current Work is Workshop-Level

Core contribution is prompt engineering, not ML research
Table of anticipated reviewer concerns with severity levels

11.2 Required Technical Contributions (Options to Elevate)

Option A: Learned Demo Retrieval (2-3 months, RECOMMENDED) - Train retrieval to optimize action accuracy
Option B: Learned Prompt Synthesis (3-4 months) - Learn optimal demo formatting/compression
Option C: Behavioral Cloning with Demo-Augmentation (4-6 months) - Fine-tune VLM with demo attention
Option D: Theoretical Analysis (2-3 months) - Information-theoretic analysis
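
To make Option A more concrete, here is a minimal sketch of what "training retrieval to optimize action accuracy" could look like. It is an illustration under stated assumptions, not the method described in the roadmap: the per-dimension weight vector, the pairwise logistic loss, and the names `score`, `pairwise_update`, and `retrieve` are all hypothetical. It assumes each query and demo is already embedded as a list of floats, and that for a given query we know one demo that led to the correct action and one that did not.

```python
import math


def score(weights, query, demo):
    """Weighted dot product between a query embedding and a demo embedding."""
    return sum(w * q * d for w, q, d in zip(weights, query, demo))


def pairwise_update(weights, query, helpful_demo, unhelpful_demo, lr=0.1):
    """One gradient step on the pairwise logistic loss log(1 + exp(-margin)).

    margin = score(helpful) - score(unhelpful). The step raises the score of
    the demo that led to a correct action relative to the one that did not.
    """
    margin = score(weights, query, helpful_demo) - score(weights, query, unhelpful_demo)
    # Derivative of log(1 + exp(-margin)) w.r.t. margin is -1 / (1 + exp(margin)).
    coeff = -1.0 / (1.0 + math.exp(margin))
    return [
        w - lr * coeff * q * (h - u)
        for w, q, h, u in zip(weights, query, helpful_demo, unhelpful_demo)
    ]


def retrieve(weights, query, demos, k=1):
    """Return the names of the k highest-scoring demos for this query."""
    ranked = sorted(demos, key=lambda name: score(weights, query, demos[name]), reverse=True)
    return ranked[:k]


# Toy data (hypothetical): two demos with 2-dimensional embeddings.
demos = {"demo_a": [1.0, 0.0], "demo_b": [0.0, 1.0]}
query = [1.0, 1.0]
weights = [1.0, 1.0]

# Suppose demo_b produced the correct action for this query and demo_a did not.
weights = pairwise_update(weights, query, demos["demo_b"], demos["demo_a"])
print(retrieve(weights, query, demos, k=1))
```

In a real system the supervision signal would come from running the agent with each candidate demo and checking the resulting action, which is where most of the estimated effort would likely go.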

11.3 Additional Experiments Required

Full WAA (50+ tasks, 3 seeds), WebArena (100+ tasks)
Episode success rate, multi-model comparison, ablation studies
Statistical significance requirements
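
As one possible way to meet the statistical significance requirement, the sketch below computes episode success rate and a paired bootstrap confidence interval for the difference between two conditions. The roadmap does not specify which test to use, so the choice of a percentile bootstrap, the function names, and the resample count are assumptions. Outcomes are assumed to be 0/1 flags aligned by (task, seed), for example 50 tasks x 3 seeds = 150 episodes per condition.

```python
import random
from statistics import mean


def success_rate(outcomes):
    """Fraction of episodes that succeeded; outcomes are 0/1 flags."""
    return mean(outcomes)


def paired_bootstrap_ci(baseline, treatment, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for success_rate(treatment) - success_rate(baseline).

    Both lists must be aligned so that index i refers to the same (task, seed)
    episode in each condition; episodes are resampled in pairs.
    """
    if len(baseline) != len(treatment):
        raise ValueError("outcome lists must be aligned and of equal length")
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(baseline)
    diffs = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(mean(treatment[i] for i in idx) - mean(baseline[i] for i in idx))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

If the resulting interval excludes zero, the difference in episode success rate is unlikely to be a resampling artifact at the chosen `alpha`; with only three seeds per task, the interval may still be wide.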

11.4 Timeline and Resources

Minimum 6-7 months for main track
1-2 dedicated researchers (FTE)
$2-5k GPU compute, $1-3k API credits

11.5 Honest Recommendation

Small team: Focus on workshop paper
Dedicated resources: Pursue Option A (Learned Retrieval)
Clear guidance on when NOT to attempt main track

11.6 Additional References

REALM, Atlas, DocPrompting (retrieval-augmented learning)
APE, DSPy, PromptBreeder (automatic prompt engineering)
CogAgent, SeeClick, RT-2 (GUI agent fine-tuning)

Test plan

Verify markdown renders correctly on GitHub
Verify Table of Contents link works
Review content for accuracy and completeness

Generated with Claude Code

This section provides a rigorous and honest assessment of what would be required to elevate the current work from workshop-level to main track publication at venues like NeurIPS, ICML, or ICLR.

Key additions:
- 11.1: Honest assessment of why current work is workshop-level (prompt engineering, not ML research) with table of reviewer concerns
- 11.2: Four technical contribution options to elevate the work:
  - Option A: Learned Demo Retrieval (RECOMMENDED, 2-3 months)
  - Option B: Learned Prompt Synthesis (3-4 months)
  - Option C: Behavioral Cloning with Demo-Augmentation (4-6 months)
  - Option D: Theoretical Analysis (2-3 months)
- 11.3: Additional experiments required (WAA 50+ tasks, WebArena 100+, multi-model, ablations, statistical significance)
- 11.4: Timeline and resource estimates (6-7 months minimum, 1-2 FTE, $5-10k compute/API costs)
- 11.5: Honest recommendation based on team resources
- 11.6: Additional references (REALM, Atlas, DocPrompting, APE, DSPy, CogAgent, SeeClick, RT-2)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

abrichr merged commit d4b7ca0 into main Jan 17, 2026
6 checks passed

abrichr deleted the feature/publication-roadmap-main-track-path branch January 17, 2026 05:42
